A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments
Authors
Abstract
Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that operate on the reconstructed speech. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for recognition performance. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to provide better speech recognition performance than linear prediction coefficients (LPCs), the typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCCs instead of LPCs to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCCs is developing an efficient MFCC quantization scheme at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, we propose a safety-net scheme to make the MFCC-based speech coder robust to channel errors. The result is an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed coder has speech quality comparable to that of 8 kbps G.729, and speech recognition performance with the proposed coder is expected to be better than with G.729.
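The abstract does not give the quantizer design, so the following is only a minimal sketch of how predictive quantization with a safety net can work in general. It assumes a first-order predictor and a clipped uniform scalar quantizer standing in for whatever codebooks the authors actually use; the function names and the parameters `alpha`, `step_pred`, `step_direct`, and `levels` are all hypothetical. For each frame, the encoder tries both a predictive path (quantize the residual against the previous reconstructed frame) and a memoryless path (quantize the frame directly), keeps the one with lower error, and signals the choice with a one-bit flag.

```python
import numpy as np

def clipped_quantize(v, step, levels):
    """Uniform scalar quantizer with a fixed number of levels (assumed stand-in for a codebook)."""
    idx = np.clip(np.round(v / step), -(levels // 2), levels // 2 - 1)
    return step * idx

def safety_net_encode(frames, alpha=0.7, step_pred=0.1, step_direct=1.0, levels=16):
    """Return (modes, recon): mode 0 = predictive path, mode 1 = memoryless safety-net path.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    frames = np.asarray(frames, dtype=float)
    prev = np.zeros(frames.shape[1])      # previous reconstructed frame
    recon = np.empty_like(frames)
    modes = []
    for t, x in enumerate(frames):
        # Predictive path: fine step, small range, relies on interframe correlation.
        pred = alpha * prev
        cand_pred = pred + clipped_quantize(x - pred, step_pred, levels)
        # Safety-net path: coarse step, wide range, independent of past frames.
        cand_direct = clipped_quantize(x, step_direct, levels)
        err_pred = np.sum((x - cand_pred) ** 2)
        err_direct = np.sum((x - cand_direct) ** 2)
        if err_pred <= err_direct:
            modes.append(0)
            recon[t] = cand_pred
        else:
            modes.append(1)
            recon[t] = cand_direct
        prev = recon[t]
    return modes, recon

# Two slowly varying frames followed by an abrupt change.
modes, recon = safety_net_encode([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]])
```

In this sketch the memoryless path is chosen when the frame changes too abruptly for the residual quantizer's range. Because a memoryless frame does not depend on earlier frames, a decoder that has lost or corrupted frames can resynchronize at the next such frame, which is the usual motivation for a safety net under channel errors.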
Similar resources
A MFCC-based CELP speech coder for server-based speech recognition in network environments
Existing standard speech coders can provide high-quality speech communication. However, they tend to degrade the performance of automatic speech recognition (ASR) systems that use the reconstructed speech. The main cause of the degradation is that the linear predictive coefficients (LPCs), which are typical spectral envelope parameters in speech coding, are optimized to speech quality rather...
A bitrate and bandwidth scalable CELP coder
This paper proposes a flexible CELP speech coder with bitrate and bandwidth scalabilities for multimedia applications. The coder is based on multi-pulse-based CELP coding and consists of a bitrate scalable base-band coder and a bandwidth extension tool. The bitrate scalable base-band CELP coder employs multi-stage excitation coding based on an embedded-coding approach. The multipulse excitation...
Low-Delay Speech Coders at 16 kb/s: A CELP and A Tree Coder
For speech coders to be used in network applications, a transparent or near transparent quality (Mean Opinion Score rating of 4.0) is required. Though this is a necessary criterion, other desired properties include: low delay, robustness to channel errors, moderate complexity, capability to handle non-speech signals in the telephone band, and good tandeming performance. The CCITT's current consider...
A 16 kb/s Wideband CELP-Based Speech Coder Using Mel-Generalized Cepstral Analysis
We propose a wideband CELP-type speech coder at 16 kb/s based on a mel-generalized cepstral (MGC) analysis technique. MGC analysis makes it possible to obtain a more accurate representation of spectral zeros compared to linear predictive (LP) analysis and take a perceptual frequency scale into account. A major advantage of the proposed coder is that the benefits of MGC representation of speech ...
A wideband CELP speech coder at 16 kbit/s based on mel-generalized cepstral analysis
This paper proposes a wideband CELP coder using frequency warping. Instead of linear prediction, the proposed coder adopts mel-generalized cepstral analysis and encodes the full band of the speech signal through a warped frequency scale. It is shown that the subjective quality of the proposed coder at 16 kbit/s is better than that of the ITU-T G.722 at 64 kbit/s. Furthermore, the proposed coder ...